Methods and physical computer storage media for transferring de-duplicated data organized in virtual volumes to a target set of physical media

ABSTRACT

For forming an initial bitmap from deduplicated data on virtual volumes, discrete blocks are sorted according to frequency of occurrence to form a revised bitmap to first include a plurality of most common discrete blocks. A physical volume map is created from the revised bitmap. An initial virtual volume of the virtual volumes contained on a corresponding original physical volume is reviewed to determine whether moving the initial virtual volume to a different physical volume reduces the total number of data blocks in the physical volume map. The initial virtual volume is deleted from its corresponding original physical volume and added to the different original physical volume to create a revised physical volume map including revised physical volumes. The revised physical volume is written to the target set of physical media using the revised physical volume map.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates in general to de-duplicated data and methods of transferring de-duplicated data that is organized in virtual volumes to a target set of physical media.

2. Description of the Related Art

Data deduplication is a data storage method for eliminating or reducing redundant data. In particular, data deduplication allows one unique instance of the data to be retained on storage media, rather than multiple instances of the same data, by replacing the multiple instances with a pointer to a single instance. In this regard, data stored into a deduplication system is analyzed and broken into “blocks.” Each block is given a digital signature, so that two blocks with the same signature are known to contain the same data; duplicate blocks are thereby identified and eliminated. Normally, as the deduplication system breaks apart data comprising larger objects into the individual blocks, the system tracks the individual blocks so that the corresponding larger object can be retrieved when desired. For example, an index or database is provided to track the blocks.
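By way of illustration only, the block-level deduplication described above can be sketched as follows. The fixed block size, the use of SHA-256 as the digital signature, and the names `deduplicate` and `reconstruct` are assumptions made for this sketch and are not taken from the described system.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Assumptions for illustration: fixed-size blocks and SHA-256 signatures.
    """
    store = {}   # signature -> the single retained instance of the block
    index = []   # ordered signatures (pointers) needed to rebuild the object
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        signature = hashlib.sha256(block).hexdigest()
        store.setdefault(signature, block)  # duplicates are not stored again
        index.append(signature)
    return store, index


def reconstruct(store, index) -> bytes:
    """Rebuild the larger object by following the index of signatures."""
    return b"".join(store[signature] for signature in index)


original = b"ABCDABCDWXYZABCD"
store, index = deduplicate(original)
```

Here the four input blocks reduce to two stored blocks, while the index retains four entries so that the original object can be retrieved.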

The deduplication system may be embodied in a storage management system that spans multiple storage volumes and storage pools. For example, data may be sent by storage management clients or data protection agents to the storage management server for storage. In this regard, a storage management system typically stores copies of objects on separate media, so that a data set that is too large to fit on a single volume can be stored. Alternatively, data can be moved from one storage location to another, either within the same storage pool or between storage pools or between different media, such as between a disk and tape, which store different amounts of data.

SUMMARY OF THE INVENTION

Many virtual volumes are stored in a large, homogeneous storage pool. Typically, the storage is disk-based, and the virtual volumes are not individually de-duplicated but rather are de-duplicated as part of entry to the storage pool. When there is a need to write these virtual volumes out to physical media, difficulty may arise where each item of physical media is a small fraction of the size of the original storage pool. Thus, it is desirable to have a method and system for transferring de-duplicated data that is organized in virtual volumes into a target set of physical media.

In an improved method, steps include forming an initial bitmap from the de-duplicated data on virtual volumes, the de-duplicated data on virtual volumes comprising a total number of discrete blocks, sorting the discrete blocks according to frequency of occurrence to form a revised bitmap to first include a plurality of most common discrete blocks, creating a physical volume map from the revised bitmap, the physical volume map associating each discrete block of the de-duplicated data blocks on a virtual volume with an original physical volume and including a total set of original virtual volumes and total number of data blocks, reviewing, from the physical volume map, an initial virtual volume of the virtual volumes contained on a corresponding original physical volume, to determine whether moving the initial virtual volume from its corresponding original physical volume to a different physical volume in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume, deleting the initial virtual volume from its corresponding original physical volume and adding the initial virtual volume to the different original physical volume to create a revised physical volume map including revised physical volumes, and writing the revised physical volumes to the target set of physical media using the revised physical volume map.

In another embodiment, by way of example only, a physical computer storage medium comprises a computer program product method for transferring de-duplicated data that is organized in virtual volumes to a target set of physical media. The physical computer storage medium includes computer code for forming an initial bitmap from the de-duplicated data on virtual volumes, the de-duplicated data on virtual volumes comprising a total number of discrete blocks, computer code for sorting the discrete blocks according to frequency of occurrence to form a revised bitmap to first include a plurality of most common discrete blocks, computer code for creating a physical volume map from the revised bitmap, the physical volume map associating each discrete block of the de-duplicated data on a virtual volume with an original physical volume and including a total set of original virtual volumes and total number of data blocks, computer code for reviewing, from the physical volume map, a first virtual volume of the virtual volumes contained on a corresponding original physical volume, to determine whether moving the first virtual volume from its corresponding original physical volume to a different physical volume in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume, computer code for deleting the first virtual volume from its corresponding original physical volume and adding the first virtual volume to the different original physical volume to create a revised physical volume map including revised physical volumes, and computer code for writing the revised physical volumes to the target set of physical media using the revised physical volume map.

BRIEF DESCRIPTION OF THE DRAWINGS

In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments that are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings, in which:

FIG. 1 is a block diagram of a data storage system, according to an embodiment;

FIG. 2A is a schematic of a storage controller communicating with data storage media, according to an embodiment;

FIG. 2B is a schematic of an information storage and retrieval system, according to an embodiment; and

FIG. 3 is a flow diagram of a method of transferring de-duplicated data that is organized in virtual volumes to a target set of physical media, according to an embodiment.

DETAILED DESCRIPTION OF THE DRAWINGS

The illustrated embodiments below provide an improved method for transferring de-duplicated data that is organized in virtual volumes to a target set of physical media. The method includes forming an initial bitmap from the de-duplicated data on virtual volumes, the de-duplicated data on virtual volumes comprising a total number of discrete blocks, sorting the discrete blocks according to frequency of occurrence to form a revised bitmap to first include a plurality of most common discrete blocks, creating a physical volume map from the revised bitmap, the physical volume map associating each discrete block of the de-duplicated data blocks on a virtual volume with an original physical volume and including a total set of original virtual volumes and total number of data blocks, reviewing, from the physical volume map, an initial virtual volume of the virtual volumes contained on a corresponding original physical volume, to determine whether moving the initial virtual volume from its corresponding original physical volume to a different physical volume in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume, deleting the initial virtual volume from its corresponding original physical volume and adding the initial virtual volume to the different original physical volume to create a revised physical volume map including revised physical volumes, and writing the revised physical volumes to the target set of physical media using the revised physical volume map.

FIG. 1 is a block diagram of a data storage system, according to an embodiment. Data processing system 100 comprises storage controller 120 and data storage media 130, 140, 150, and 160. In the illustrated embodiment of FIG. 1, storage controller 120 communicates with data storage media 130, 140, 150, and 160, via I/O protocols 132, 142, 152, and 162, respectively. I/O protocols 132, 142, 152, and 162, may comprise any sort of I/O protocol, including without limitation a fibre channel loop, SCSI (Small Computer System Interface), iSCSI (Internet SCSI), SAS (Serial Attached SCSI), Fibre Channel, SCSI over Fibre Channel, Ethernet, Fibre Channel over Ethernet, Infiniband, and SATA (Serial ATA).

As used herein, the term “data storage media” is defined as an information storage medium in combination with the hardware, firmware, and/or software, needed to write information to, and read information from, that information storage medium. In certain embodiments, the information storage medium comprises a magnetic information storage medium, such as and without limitation a magnetic disk, magnetic tape, and the like. In other embodiments, the information storage medium comprises an optical information storage medium, such as and without limitation a CD, DVD (Digital Versatile Disk), HD-DVD (High Definition DVD), BD (Blu-ray Disc) and the like. In still other embodiments, the information storage medium comprises an electronic information storage medium, such as and without limitation a PROM, EPROM, EEPROM, Flash PROM, compactflash, smartmedia, and the like. In still yet other embodiments, the information storage medium comprises a holographic information storage medium.

Storage controller 120 communicates with host computers 102, 104, and 106. Generally, host computers 102, 104, and 106, each comprise a computing system, such as a mainframe, personal computer, workstation, and combinations thereof, including an operating system such as Windows®, AIX®, UNIX®, Z/OS®, LINUX®, etc. (Windows® is a registered trademark of Microsoft Corporation; AIX® is a registered trademark and Z/OS® is a trademark of IBM Corporation; UNIX® is a registered trademark in the United States and other countries licensed exclusively through The Open Group; and LINUX® is a registered trademark of Linus Torvalds). One or more of host computers 102, 104, and/or 106, further includes a storage management program. In accordance with an embodiment, the storage management program may include a functionality of storage management type programs known in the art that manage the transfer of data to and from a data storage and retrieval system, such as for example and without limitation the IBM DFSMS implemented in the IBM Z/OS® operating system.

Storage controller 120 comprises processor 128 and computer readable medium 121, microcode 122 written to computer readable medium 121, instructions 124 written to computer readable medium 121, a first stage hash algorithm 123 written to computer readable medium 121, and a second stage hash algorithm 125 written to computer readable medium 121. Processor 128 utilizes microcode 122 to operate storage controller 120. In the illustrated embodiment of FIG. 1, storage controller 120 further comprises deduplication queue 126. Processor 128 performs certain operations related to data received from one or more host computers, such as for example and without limitation data deduplication.

In an embodiment, host computers 102, 104, and 106, are connected to fabric 110 utilizing I/O protocols 103, 105, and 107, respectively. I/O protocols 103, 105, and 107, may be any type of I/O protocol; for example, TCP/IP, NFS, CIFS, FTP, HTTP protocols, a Fibre Channel (“FC”) loop or Fibre Channel over Ethernet, a direct attachment to fabric 110 or one or more signal lines used by host computers 102, 104, and 106, to transfer information to and from fabric 110.

In certain embodiments, fabric 110 includes, for example, one or more FC switches 115. Those one or more switches 115 comprise one or more conventional router switches, in an embodiment. In the illustrated embodiment of FIG. 1, one or more switches 115 interconnect host computers 102, 104, and 106, to storage controller 120 via I/O protocol 117. I/O protocol 117 may comprise any type of I/O interface, for example, a Fibre Channel, Infiniband, Gigabit Ethernet, Ethernet, Fibre Channel over Ethernet, TCP/IP, iSCSI, SCSI I/O interface or one or more signal lines used by FC switch 115 to transfer information to and from storage controller 120, and subsequently data storage media 130, 140, 150, and 160. In other embodiments, one or more host computers, such as for example and without limitation host computers 102, 104, and 106, communicate directly with storage controller 120 using I/O protocols 103, 105, and 107, respectively.

FIG. 2A is a schematic of storage controller 120 communicating with data storage media 130, 140, 150, and 160, according to an embodiment. In an embodiment, communication occurs using a fibre channel arbitrated loop (“FC-AL”) of switches, wherein controller 120 and media 130, 140, 150, and 160, are disposed in information storage and retrieval system 200. As those skilled in the art will appreciate, information storage and retrieval system 200 further comprises additional elements, such as and without limitation one or more host adapters, one or more device adapters, a data cache, non-volatile storage, and the like. The illustrated embodiment of FIG. 2A should not be construed to limit the invention to use of fibre channel networks or devices. In other embodiments, other network topologies and devices are utilized, including without limitation SAS devices and/or SATA devices.

FIG. 2B is a schematic of an information storage and retrieval system 202, according to an embodiment. The system 202 comprises dual FC-AL loops of switches wherein storage controller 120A and storage controller 120B are interconnected with both FC-AL loops. Each FC-AL loop contains one or more local controllers, such as local controllers 210, 220, 230, 240, 250, and 260. As those skilled in the art will appreciate, information storage and retrieval system 202 further comprises additional elements, such as and without limitation one or more host adapters, one or more device adapters, a data cache, non-volatile storage, and the like. Each storage controller is in communication with a first plurality of data storage media 270, a second plurality of data storage media 280, and a third plurality of data storage media 290. The illustrated embodiment of FIG. 2B should not be construed to limit the invention to use of fibre channel networks or devices. In the illustrated embodiment of FIG. 2B, the recitation of two FC-AL loops comprises one embodiment of Applicants' apparatus. In other embodiments, other network topologies and devices are utilized, including without limitation SAS devices and/or SATA devices.

Each plurality of data storage media 270, 280, 290 represents a physical volume (P_(n)) on which data is stored. The data on the physical volumes (P_(n)) are treated as blocks. Often, although the data blocks are located on particular physical volumes, the blocks are grouped together logically into virtual volumes (V_(n)) and thus can span multiple disks. However, on tapes that are physical volumes, it is desirable for virtual volumes not to span tapes, because retrieving one virtual volume could otherwise require multiple tape mounts.

Because some of the blocks may be redundant, the data may be de-duplicated to eliminate the redundant data, leaving only one copy of the data to be stored. Thus, to optimize storage, all discrete blocks (B_(n)) are ideally stored on as few physical volumes (P_(n)) as possible. Accordingly, an absolute ideal number of physical volumes is equal to the number of discrete blocks B_(n) divided by the number of blocks that can be stored on each P_(n), so that each physical volume (e.g., data storage media 270, 280, 290) is completely full of discrete blocks.
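As a small worked illustration of the ideal figure described above, the following sketch computes the quotient of discrete blocks over per-volume capacity. Rounding up to a whole number of volumes is an assumption added here (the text states only the division), and the function name and sample values are hypothetical.

```python
def ideal_physical_volume_count(discrete_blocks: int, blocks_per_volume: int) -> int:
    """Return the fewest physical volumes that could hold all discrete blocks.

    Assumption for illustration: a partially filled last volume counts as one
    whole volume, so the quotient is rounded up.
    """
    if blocks_per_volume <= 0:
        raise ValueError("blocks_per_volume must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-discrete_blocks // blocks_per_volume)


exact_fit = ideal_physical_volume_count(900, 300)     # every volume completely full
with_remainder = ideal_physical_volume_count(1000, 300)
```

With 900 discrete blocks and 300 blocks per volume, three completely full volumes suffice; with 1000 blocks, a fourth, partially filled volume is needed.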

FIG. 3 is a flow diagram of a method 300 of transferring de-duplicated data that is organized in virtual volumes to a target set of physical media, according to an embodiment. Method 300 can be employed to achieve the above-described ideal storage. In an embodiment, after the start 302, a list of discrete blocks (UB_(n)) is sorted into a bitmap according to frequency of occurrence, block 304. In an embodiment, the initial bitmap is formed from the de-duplicated data that has been organized into virtual volumes, where the de-duplicated data organized into the virtual volumes comprises a total number of discrete blocks. According to an embodiment, a set of virtual volumes can have a total number N of discrete blocks (B_(n1) . . . B_(nN)). A first virtual volume may have some of the discrete blocks and may contain B_(n1), B_(n2), B_(n3), B_(n7), and B_(n8). A resulting bitmap for the first virtual volume may be 11100011. A second virtual volume may have some of the discrete blocks and may contain B_(n1), B_(n2), B_(n3), B_(n5), B_(n6), B_(n7), and B_(n8). In such case, the resulting bitmap for the second virtual volume may be 11101111. After bitmaps are created for all of the virtual volumes of the set of virtual volumes, the bitmaps are compared to determine a frequency of the discrete blocks as they appear in each virtual volume. For instance, the first and second virtual volumes may be compared:

    11100011 (first virtual volume)
    11101111 (second virtual volume)

The discrete blocks of the bitmaps are sorted according to frequency of occurrence to form a revised bitmap to first include a plurality of most common discrete blocks, in an embodiment. As shown above, B_(n1), B_(n2), B_(n3) are the most common discrete blocks in the first and second virtual volumes, and thus could be eliminated from consideration to improve speed of the process and/or to form the revised bitmap. In another embodiment, the initial bitmap can be created by any one of numerous processes.
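The bitmap formation and frequency sort described above may be sketched as follows, using the two example virtual volumes. Representing each virtual volume as a set of block numbers, and forming the revised bitmap by reordering bit positions from most to least common block, are assumptions made for this sketch; the names `to_bitmap`, `frequency`, and `order` are hypothetical.

```python
from collections import Counter

TOTAL_BLOCKS = 8  # N discrete blocks, numbered 1..N as in the example

# Block membership of each virtual volume, taken from the example in the text.
virtual_volumes = {
    "V1": {1, 2, 3, 7, 8},
    "V2": {1, 2, 3, 5, 6, 7, 8},
}


def to_bitmap(blocks, positions):
    """One bit per listed block position: '1' if the volume contains it."""
    return "".join("1" if b in blocks else "0" for b in positions)


natural_order = list(range(1, TOTAL_BLOCKS + 1))
initial = {name: to_bitmap(blocks, natural_order)
           for name, blocks in virtual_volumes.items()}

# Count in how many virtual volumes each discrete block occurs.
frequency = Counter(b for blocks in virtual_volumes.values() for b in blocks)

# Most common blocks first; ties are broken by block number (an assumption).
order = sorted(natural_order, key=lambda b: (-frequency[b], b))

revised = {name: to_bitmap(blocks, order)
           for name, blocks in virtual_volumes.items()}
```

The initial bitmaps reproduce 11100011 and 11101111, and in the revised bitmaps the blocks that occur in both virtual volumes occupy the leading positions.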

Next, a strawman memory map is created and sorted according to the bitmap, block 306. For example, the strawman memory map (also referred to herein as “a physical volume map”) is created from the revised bitmap. The physical volume map associates each discrete block of the de-duplicated data on a virtual volume with an original physical volume and includes a total set of original virtual volumes and a total number of data blocks. In an embodiment, the physical volume map is an in-memory map. In other embodiments, the physical volume map is stored or referenced from another area where mapping can be stored. To improve processing at block 306, each original physical volume associated with the physical volume map is sorted by corresponding block bitmaps into descending order, where a plurality of most common data blocks appears first, to form the physical volume map. As a result, physical volumes in the map with the most common blocks group together at the top of the list and are more likely to be considered, and thus eliminated, first. In an embodiment, a subset of data blocks in the plurality of data blocks may be identified, where the subset of data blocks has a common bit pattern so that portions of each data block in the subset not including the common bit pattern can be compared in subsequent steps.

In any case, after the strawman memory map is created, a determination is made as to whether a stopping condition has been met, block 308. In an embodiment, the stopping condition is met after a number of iterations (e.g., a maximum number of allowed loops) has been performed. In another embodiment, the stopping condition is met after a predetermined number of iterations is performed without an occurrence of a deletion of a reviewed physical volume. In another embodiment, the stopping condition is met when an ideal total number of physical volumes or a predetermined percentage of a total number of original physical volumes has been achieved. In yet another embodiment, the stopping condition is met after a predetermined duration of time. If the stopping condition has been met, then each physical volume is written out to physical media (e.g., data storage media 270, 280, 290) according to the memory map listed, block 310. Specifically, revised physical volumes are written to the target set of physical media using a revised physical volume map, which will be described in more detail below. The method 300 then stops at block 312.

If a determination has been made that the stopping condition has not been met, then another determination is made as to whether all physical volumes have been examined, block 314. If so, a physical volume pointer is set to a first physical volume, block 316, and the method 300 iterates from block 308.

If all physical volumes have not been examined, then a determination is made as to whether all of the virtual volumes on a current physical volume have been examined, block 318. If so, a physical volume pointer is set to a next physical volume, block 320, and the method 300 iterates at block 314.

If not, a comparison is made between a block bitmap of the virtual volumes against a block bitmap of all of the physical volumes other than that of the current physical volume, block 322. For example, a bitmap of an initial virtual volume (e.g., a bitmap of a virtual volume that has not been examined) is compared with each bitmap of each original physical volume in the total set of original physical volumes.

Next, a determination is made as to whether moving a virtual volume to a different physical volume can improve mapping, block 324. In particular, the determination is based on whether moving the initial virtual volume from an original physical volume with which it is associated to a new physical volume will improve mapping. Improving mapping can be achieved by reducing a number of physical volumes, or by reducing the total number of blocks required to be stored on the original physical volume plus the total number of blocks required to be stored on the different physical volume, provided that, after the move, neither the original physical volume nor the different physical volume contains more blocks than the maximum number of blocks that can be stored on a physical volume. In an embodiment, a review is made of the initial virtual volume of the virtual volumes contained on a corresponding original physical volume from the physical volume map, to determine whether moving the initial virtual volume from its corresponding original physical volume to a different physical volume in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume. After the initial virtual volume is examined, a subsequent virtual volume of the virtual volumes contained on a corresponding original physical volume then is reviewed from the revised physical volume map, to determine whether moving the subsequent virtual volume from its corresponding original physical volume to a different original physical volume or revised physical volume reduces the total number of data blocks in the revised physical volume map without exceeding a total number of allowed blocks on the different original physical volume or revised physical volume, and so on. In still another embodiment, reviewing comprises identifying consolidation targets in the virtual volumes, wherein the consolidation targets comprise common data blocks, and determining whether moving the common data blocks from corresponding original physical volumes to different physical volumes in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume.
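The determination at block 324 can be illustrated with a minimal sketch under stated assumptions: each physical volume is modeled as a list of virtual volume names, the blocks stored on a physical volume are the union of the blocks of its virtual volumes, and only the block-count criterion (not the reduction in the number of physical volumes) is evaluated. The names `blocks_on` and `move_improves` and the sample data are hypothetical.

```python
def blocks_on(volume_names, virtual_volumes):
    """Discrete blocks a physical volume must store: the union over its virtual volumes."""
    blocks = set()
    for name in volume_names:
        blocks |= virtual_volumes[name]
    return blocks


def move_improves(vv, source, target, physical_map, virtual_volumes, capacity):
    """True if moving vv from source to target lowers the combined block count
    of the two physical volumes without exceeding capacity on the target."""
    before = (len(blocks_on(physical_map[source], virtual_volumes))
              + len(blocks_on(physical_map[target], virtual_volumes)))
    remaining = [v for v in physical_map[source] if v != vv]
    after_source = len(blocks_on(remaining, virtual_volumes))
    after_target = len(blocks_on(physical_map[target] + [vv], virtual_volumes))
    return after_target <= capacity and after_source + after_target < before


virtual_volumes = {
    "V1": {1, 2, 3, 7, 8},
    "V2": {1, 2, 3, 5, 6, 7, 8},
    "V3": {4, 9},
}
physical_map = {"P1": ["V1", "V3"], "P2": ["V2"]}

move_v1 = move_improves("V1", "P1", "P2", physical_map, virtual_volumes, capacity=8)
move_v3 = move_improves("V3", "P1", "P2", physical_map, virtual_volumes, capacity=8)
```

In this example, moving V1 onto P2 adds no new blocks to P2 because V2 already contains them, so the combined count falls from 14 to 9. Moving V3 would require nine blocks on P2, which exceeds the assumed capacity of eight, so that move is rejected.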

If mapping is not improved at block 324, then the system increments a virtual volume pointer to a next virtual volume, block 326, and the method 300 iterates at block 318.

If mapping is improved, then the virtual volume is removed from a current physical volume list and added to a new physical volume list, and any discrete block on the current physical volume that is used only by the virtual volume is removed and added to the new physical volume, if the discrete block is not already on the new physical volume, block 328. In an embodiment, the initial virtual volume is deleted from its corresponding original physical volume and added to the different original physical volume to create a revised physical volume map including revised physical volumes. In subsequent iterations after the initial virtual volume is examined (e.g., after a subsequent virtual volume is examined), the subsequent virtual volume is deleted from its corresponding original physical volume and added to the different original physical volume or revised physical volume to create a subsequent revised physical volume map including subsequent revised physical volumes.

In any case, a determination is made as to whether the current physical volume is empty, block 330. If not, the system increments the virtual volume pointer to a next virtual volume, block 326, and the method 300 iterates at block 318. If the current physical volume is empty, then the current physical volume is removed from the current physical volume list, block 332, and the method 300 iterates to block 320.
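The iteration through blocks 308 to 332 can be approximated by the following simplified sketch. It is not a block-for-block rendering of method 300: it assumes a stopping condition of either a maximum number of passes or a pass in which no virtual volume moves, it applies only the block-count criterion for improvement, and it accepts the first improving target found. The function name `consolidate` and the sample data are hypothetical.

```python
def consolidate(physical_map, virtual_volumes, capacity, max_passes=10):
    """Greedy sketch: move virtual volumes between physical volumes while the
    combined block count falls, and drop physical volumes that become empty."""
    pmap = {p: list(vs) for p, vs in physical_map.items()}  # work on a copy

    def blocks(names):
        # Union of the discrete blocks of the named virtual volumes.
        return set().union(*(virtual_volumes[n] for n in names))

    for _ in range(max_passes):                 # assumed stopping condition
        moved = False
        for source in list(pmap):
            for vv in list(pmap[source]):
                for target in list(pmap):
                    if target == source:
                        continue
                    before = len(blocks(pmap[source])) + len(blocks(pmap[target]))
                    rest = [v for v in pmap[source] if v != vv]
                    after_target = len(blocks(pmap[target] + [vv]))
                    after = len(blocks(rest)) + after_target
                    if after_target <= capacity and after < before:
                        pmap[source] = rest     # delete from the current volume
                        pmap[target].append(vv) # add to the different volume
                        moved = True
                        break
                if not pmap[source]:            # current physical volume is empty
                    del pmap[source]
                    break
        if not moved:                           # a full pass with no improvement
            break
    return pmap


virtual_volumes = {
    "V1": {1, 2, 3, 7, 8},
    "V2": {1, 2, 3, 5, 6, 7, 8},
    "V3": {4, 9},
}
start = {"P1": ["V1"], "P2": ["V2"], "P3": ["V3"]}
result = consolidate(start, virtual_volumes, capacity=8)
```

In this example V1 is absorbed by P2, whose blocks already include those of V1, and the emptied P1 is removed. V3 stays on P3 because combining it with P2 would exceed the assumed capacity of eight blocks.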

An improved method has now been provided that can be used as an open standard for writing de-duplicated virtual volumes to a physical volume. The method efficiently transfers de-duplicated data that is organized in virtual volumes to a target set of physical media so that the physical media includes only discrete blocks of data for all of the virtual volumes. Moreover, some embodiments of the method retain hash index values to reconstitute complete virtual volumes when read from a single volume of the physical media.

As will be appreciated by one of ordinary skill in the art, aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module,” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer-readable medium(s) may be utilized. The computer-readable medium may be a computer-readable signal medium or a physical computer-readable storage medium. A physical computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, crystal, polymer, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Examples of a physical computer-readable storage medium include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, RAM, ROM, an EPROM, a Flash memory, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain, or store a program or data for use by or in connection with an instruction execution system, apparatus, or device.

Computer code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wired, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the foregoing. Computer code for carrying out operations for aspects of the present invention may be written in any static language, such as the “C” programming language or other similar programming language. The computer code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, or communication system, including, but not limited to, a local area network (LAN) or a wide area network (WAN), Converged Network, or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described above with reference to flow diagrams and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flow diagrams and/or block diagrams, and combinations of blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flow diagram and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flow diagram and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flow diagram and/or block diagram block or blocks.

The flow diagrams and block diagrams in the above figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flow diagrams or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flow diagrams, and combinations of blocks in the block diagrams and/or flow diagram, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

1-11. (canceled)
 12. A physical computer storage medium comprising a computer program product method for transferring de-duplicated data that is organized in virtual volumes to a target set of physical media, the physical computer storage medium comprising: computer code for forming an initial bitmap from the de-duplicated data on virtual volumes, the de-duplicated data on virtual volumes comprising a total number of discrete blocks; computer code for sorting the discrete blocks according to frequency of occurrence to form a revised bitmap to first include a plurality of most common discrete blocks; computer code for creating a physical volume map from the revised bitmap, the physical volume map associating each discrete block of the de-duplicated data on a virtual volume with an original physical volume and including a total set of original virtual volumes and total number of data blocks; computer code for reviewing, from the physical volume map, a first virtual volume of the virtual volumes contained on a corresponding original physical volume, to determine whether moving the first virtual volume from its corresponding original physical volume to a different physical volume in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume; computer code for deleting the first virtual volume from its corresponding original physical volume and adding the first virtual volume to the different original physical volume to create a revised physical volume map including revised physical volumes; and computer code for writing the revised physical volumes to the target set of physical media using the revised physical volume map.
 13. The physical computer storage medium of claim 12, further comprising: computer code for sorting each original physical volume associated with the physical volume map by corresponding block bitmaps into descending order first including a plurality of most common data blocks to a least common data block to form the physical volume map.
 14. The physical computer storage medium of claim 13, further comprising: computer code for identifying a subset of data blocks in the plurality of data blocks, the subset of data blocks having a common bit pattern and comparing portions of each data block in the subset not including the common bit pattern.
 15. The physical computer storage medium of claim 12, further comprising: computer code for comparing a bitmap of the initial virtual volume with each bitmap of each original physical volume in the total set of original physical volumes.
 16. The physical computer storage medium of claim 12, further comprising: computer code for reviewing, from the revised physical volume map, a subsequent virtual volume of the virtual volumes contained on a corresponding original physical volume, to determine whether moving the subsequent virtual volume from its corresponding original physical volume to a different original physical volume or revised physical volume reduces the total number of data blocks in the revised physical volume map without exceeding a total number of allowed blocks on the different original physical volume or revised physical volume; and computer code for deleting the subsequent virtual volume from its corresponding original physical volume and adding the subsequent virtual volume to the different original physical volume or revised physical volume to create a subsequent revised physical volume map including subsequent revised physical volumes.
 17. The physical computer storage medium of claim 12, further comprising: computer code for identifying consolidation targets in the virtual volumes, wherein the consolidation targets comprise common data blocks, and determining whether moving the common data blocks from corresponding original physical volumes to different physical volumes in the total set of original physical volumes reduces the total number of data blocks in the physical volume map without exceeding a total number of allowed blocks on the different physical volume.
 18. The physical computer storage medium of claim 12, further comprising: computer code for repeating the steps of reviewing and deleting until a predetermined number of iterations are performed without an occurrence of a deletion of a reviewed physical volume.
 19. The physical computer storage medium of claim 18, further comprising: computer code for deleting an empty physical volume, after the step of repeating.
 20. The physical computer storage medium of claim 12, further comprising: computer code for repeating the steps of reviewing and deleting for a predetermined duration of time.
 21. The physical computer storage medium of claim 12, further comprising: computer code for repeating the steps of forming an initial bitmap, sorting the discrete blocks, creating a physical volume map from the revised bitmap, reviewing, and deleting until a predetermined percentage of a total number of original physical volumes is achieved.
 22. The physical computer storage medium of claim 12, further comprising: computer code for repeating the steps of forming an initial bitmap, sorting the discrete blocks, creating a physical volume map from the revised bitmap, reviewing, and deleting until a predetermined percentage above an ideal number of physical volumes is achieved, wherein the ideal number is defined as each discrete block being written out to only one completely full physical volume.
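
By way of non-limiting illustration only, the steps recited in claims 12, 18, 19 and 22 may be sketched in Python as shown below. The sketch is not the claimed implementation. It assumes that each virtual volume is given as a set of block identifiers, that the initial assignment of virtual volumes to physical volumes is supplied by the caller, and that a greedy best-move search is an acceptable review strategy. The names `build_bitmaps`, `blocks_on`, `total_blocks`, `consolidate`, `ideal_volume_count`, `max_blocks` and `quiet_passes` are hypothetical and do not appear in the specification. Reading the "ideal number" of claim 22 as the unique block count divided by the per-volume capacity, rounded up, is likewise an assumption.

```python
from collections import Counter


def build_bitmaps(virtual_volumes):
    """Sort blocks by descending frequency and build one bitmask per virtual volume.

    Bit i of a mask is set when block_order[i] occurs in that virtual volume,
    so the most common blocks occupy the lowest bit positions.
    """
    freq = Counter(b for blocks in virtual_volumes.values() for b in blocks)
    # Ties are broken by block identifier (an assumption, for determinism).
    block_order = sorted(freq, key=lambda b: (-freq[b], b))
    position = {b: i for i, b in enumerate(block_order)}
    bitmaps = {
        name: sum(1 << position[b] for b in blocks)
        for name, blocks in virtual_volumes.items()
    }
    return block_order, bitmaps


def blocks_on(vv_names, bitmaps):
    """Count the distinct blocks a physical volume holding vv_names must store."""
    mask = 0
    for vv in vv_names:
        mask |= bitmaps[vv]
    return bin(mask).count("1")


def total_blocks(volume_map, bitmaps):
    """Total number of data blocks across all physical volumes in the map."""
    return sum(blocks_on(vvs, bitmaps) for vvs in volume_map.values())


def consolidate(volume_map, bitmaps, max_blocks, quiet_passes=1):
    """Move virtual volumes while a move lowers the total and fits the target.

    Stops after quiet_passes consecutive passes without a move, then drops
    physical volumes left empty.
    """
    volume_map = {pv: list(vvs) for pv, vvs in volume_map.items()}
    idle = 0
    while idle < quiet_passes:
        moved = False
        for src in list(volume_map):
            for vv in list(volume_map[src]):
                before = total_blocks(volume_map, bitmaps)
                trial_src = [v for v in volume_map[src] if v != vv]
                best = None
                for dst in volume_map:
                    if dst == src:
                        continue
                    trial_dst = volume_map[dst] + [vv]
                    if blocks_on(trial_dst, bitmaps) > max_blocks:
                        continue  # would exceed the allowed blocks on dst
                    after = (before
                             - blocks_on(volume_map[src], bitmaps)
                             - blocks_on(volume_map[dst], bitmaps)
                             + blocks_on(trial_src, bitmaps)
                             + blocks_on(trial_dst, bitmaps))
                    if after < before and (best is None or after < best[0]):
                        best = (after, dst)
                if best is not None:
                    volume_map[src].remove(vv)
                    volume_map[best[1]].append(vv)
                    moved = True
        idle = 0 if moved else idle + 1
    return {pv: vvs for pv, vvs in volume_map.items() if vvs}


def ideal_volume_count(block_order, max_blocks):
    """Assumed reading of the ideal number: unique blocks / capacity, rounded up."""
    return -(-len(block_order) // max_blocks)


# Hypothetical data: four virtual volumes spread over two physical volumes.
virtual_volumes = {
    "VV1": {"a", "b", "c"},
    "VV2": {"a", "b", "d"},
    "VV3": {"x", "y"},
    "VV4": {"x", "y", "z"},
}
initial_map = {"PV1": ["VV1", "VV3"], "PV2": ["VV2", "VV4"]}

block_order, bitmaps = build_bitmaps(virtual_volumes)
revised_map = consolidate(initial_map, bitmaps, max_blocks=6)
print(total_blocks(initial_map, bitmaps), "->", total_blocks(revised_map, bitmaps))
```

In this hypothetical data set, the virtual volumes sharing blocks "a" and "b" end up on one physical volume and those sharing "x" and "y" on the other, so the total number of stored blocks falls from 11 to 7. Because every accepted move strictly lowers the total, the loop terminates. A time-based stop as in claim 20 could replace the `quiet_passes` counter.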